Abstract: This paper presents and analyses the implementation of a novel active queue management (AQM) scheme named FavourQueue that aims to improve the transfer delay of short-lived TCP flows over a best-effort network. The idea is to dequeue first those packets that do not belong to a flow previously enqueued. The rationale is to mitigate the delay that long-lived TCP flows induce on short TCP data requests and to prevent dropped packets at the beginning of a connection and during recovery periods. Although the main target of this AQM is to accelerate short TCP traffic, we show that FavourQueue improves not only the performance of short TCP traffic but also the performance of all TCP traffic in terms of drop ratio and latency, whatever the flow size. In particular, we demonstrate that FavourQueue reduces the loss of retransmitted packets, decreases the RTO recovery ratio and improves the latency by up to 30% compared to DropTail.



I. Introduction 

The Internet is still dominated by web traffic running on top of short-lived TCP connections [1]. Indeed, as shown in [2], 95% of the client TCP traffic and 70% of the server TCP traffic have a size lower than ten packets. This follows a common web design practice, which is to keep viewed pages lightweight to improve interactive browsing in terms of response time [3]. In other words, access to a webpage often triggers several short web transfers, which keeps the downloaded page small and speeds up the display of the text content compared to the heavier components that might compose it (e.g. pictures, multimedia content, design components). As a matter of fact, and following the growth of web content, we can still expect a large amount of short web traffic in the near future.

TCP performance suffers significantly in the presence of bursty, non-adaptive cross-traffic or when the congestion window is small (i.e. in the slow-start phase or when it operates in the small window regime). Indeed, bursty losses, or losses during the small window regime, may cause Retransmission Timeouts (RTO), which trigger a slow-start phase. In the context of short TCP flows, TCP fast retransmit cannot be triggered if not enough packets are in transit. As a result, loss recovery is mainly done by the TCP RTO, and this strongly impacts the delay. Following this, in this study we seek to improve the performance of this pervasive short TCP traffic without impacting long-lived TCP flows. We aim to exploit router capabilities to enhance the performance of short TCP flows over a best-effort network by giving a higher priority to a TCP packet if no other packet belonging to the same flow is already enqueued inside a router queue. The rationale is that isolated losses (for instance losses that occur at the early stage of the connection) have a stronger impact on TCP flow performance than losses inside a large window. We therefore propose an AQM, called FavourQueue, which better protects packet retransmissions and short TCP traffic when the network is severely congested.

1 See for instance: "Best Practices for Speeding Up Your Web Site" from Yahoo developer network.

In order to give the reader a clear view of the problem we tackle with our proposal, we lean on paper [2]. Figure 1 shows that the flow duration (or latency²) of short TCP traffic is strongly impacted by an initial lost packet which is recovered later by an RTO. Indeed, at the early stage of the connection, the number of packets exchanged is too small to allow an accurate RTO estimation. Thus, an RTO is triggered by the default timer value, which is set to three seconds [4]. In this figure, the authors also give the cumulative distribution function of the TCP flow length and the probability density function of their completion time, from an experimental measurement dataset obtained during one day on an ISP BRAS link which aggregates more than 30,000 users. We have reproduced a similar experiment with ns-2 (i.e. with a similar flow length CDF according to a Pareto distribution) and obtained a similar probability density function of the TCP flow duration, as shown in Figure 2 for the DropTail queue curve. Both figures (1 and 2) clearly highlight a latency peak at t = 3 seconds, which corresponds to this default RTO value [4]. In this experiment scenario, the RTO recovery ratio is equal to 56% (versus 70% in the experiments of [2]). As a matter of fact, these experiments show that the success of the TCP slow-start is a key performance indicator. The second curve in Figure 2 shows the result we obtain by using our proposal, FavourQueue. Clearly, the peak previously emphasized has disappeared. This means the initial losses that strongly impacted the TCP traffic performance have decreased.

2 The latency refers to the delay elapsed between the first packet sent and the last packet received.

An important contribution of this work is the demonstration that our scheme, by favouring isolated TCP packets, decreases the latency by decreasing the loss ratio of short TCP flows without impacting long TCP traffic. However, as FavourQueue does not discriminate short from long TCP flows, every flow takes advantage of this mechanism when entering either the slow-start or a recovery phase. Our evaluations show that 58% of short TCP flows improve their latency and that 80% of long-lived TCP flows also take advantage of this AQM. For all flow sizes, the average gain in transfer delay is about 30%. This gain results from the decrease of the drop ratio of non-opportunistic flows, which are those that occupy the queue the least. Furthermore, the more the queue is loaded, the more impact FavourQueue has. Indeed, when there is no congestion, FavourQueue does not have any effect on the traffic. In other words, this proposal is activated only when the network is severely congested.

Finally, FavourQueue does not require any transport protocol modification. Although we talk about giving priority to certain packets, there is no per-flow state needed inside the FavourQueue router. This mechanism must be seen as an extension of DropTail that greatly enhances the performance of TCP sources by favouring (rather than prioritizing) certain TCP packets. Section II describes the design of the proposed scheme. Then, we present in Section III the experimental methodology used in this paper. Sections IV and V dissect and analyse the performance of FavourQueue. Following these experiments and statistical analysis, we propose a stochastic model of the mechanism in Section VI. We also present related work in Section VII, where we position FavourQueue with respect to other propositions and in particular discuss how this AQM complements recent proposals that aim to increase the TCP initial slow-start window. Finally, we discuss the implementation and some security issues in Section VIII and conclude this work in Section IX.

II. FavourQueue description 

Short TCP flows usually carry short TCP requests such as HTTP requests or interactive SSH or Telnet commands. As a result, their delay performance is mainly driven by:

1) the end-to-end transfer delay. This delay can be reduced if the queueing delay of each router is low;

2) the potential losses at the beginning of the connection. The first packets lost at the beginning of a TCP connection (i.e. in the slow-start phase) are mainly recovered by the RTO mechanism. Furthermore, as the RTO is initially set to a high value, this greatly decreases the performance of short TCP flows.

The two main metrics on which we can act to minimize the end-to-end delay and to protect the first packets of a TCP connection from loss are, respectively, the queuing delay and the drop ratio. Consequently, the idea we develop with FavourQueue is to favour certain packets in order to reduce their transfer delay, by giving them preferential access to transmission and by protecting them from drops.

This corresponds to implementing two mechanisms: a preferential access to transmission when a packet that must be favoured is enqueued (temporal priority), and a drop protection when the queue is full (drop precedence), using a push-out scheme that removes a standard packet in order to enqueue a favoured packet.

When a packet is enqueued, a check is done on the whole queue to seek another packet from the same flow. If no other packet is found, it becomes a favoured packet. The rationale is to decrease the loss of retransmitted packets in order to decrease the RTO recovery ratio. The proposed algorithm (given in Algorithm 1) extends the one presented in [5] by adding a drop precedence to non-favoured packets in order to decrease the loss ratio of favoured packets. The selection of a favoured packet is done on a per-flow basis. As a result, the complexity is a function of the size of the queue, which corresponds to the maximum number of states that the router must handle. This number of states is scalable considering the capability of today's routers to manage millions of flows simultaneously [6]. However, the selection decision is local and temporary, as the state only exists when at least one packet is enqueued. This explains why we prefer the term favouring packets to prioritizing packets. Furthermore, FavourQueue does not introduce packet re-ordering inside a flow, which obviously badly impacts TCP performance [7]. Finally, in the specific case where all the traffic becomes favoured, the behaviour of FavourQueue is identical to that of DropTail.

Algorithm 1 FavourQueue algorithm

function enqueue(p)
    # A new packet p of flow F is received
    if no other packet of F is present in the queue then
        # p is a favoured packet
        if the queue is full then
            if only favoured packets are in the queue then
                p is dropped
                return
            else
                # Push out
                the last standard packet is dropped
            end if
        end if
        p is inserted at position pos_
        pos_ ← pos_ + 1
    else
        # p is a standard packet
        if the queue is not full then
            p is put at the end of the queue
        else
            p is dropped
        end if
    end if
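The enqueue logic of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' ns-2 implementation: the `Packet` tuple, the class layout and the `dequeue` rule (decrementing the favoured-packet counter when the head of the queue is served) are assumptions introduced here, since the text only specifies the enqueue side.

```python
from collections import namedtuple

# Hypothetical packet record: a flow identifier and a sequence number.
Packet = namedtuple("Packet", ["flow", "seq"])


class FavourQueue:
    """Sketch of a FavourQueue: favoured packets sit in front of standard ones."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []   # index 0 is the head (next packet to be served)
        self.pos = 0      # number of favoured packets, stored at indices [0, pos)

    def enqueue(self, p):
        """Enqueue p; return the packet that was dropped, or None."""
        dropped = None
        if all(q.flow != p.flow for q in self.queue):
            # No other packet of this flow is queued: p is a favoured packet.
            if len(self.queue) >= self.capacity:
                if self.pos == len(self.queue):
                    # Only favoured packets in the queue: p itself is dropped.
                    return p
                # Push-out: the last standard packet is dropped.
                dropped = self.queue.pop()
            self.queue.insert(self.pos, p)
            self.pos += 1
        else:
            # p is a standard packet: plain DropTail behaviour.
            if len(self.queue) < self.capacity:
                self.queue.append(p)
            else:
                dropped = p
        return dropped

    def dequeue(self):
        """Serve the head of the queue (assumed rule, not given in the text)."""
        if not self.queue:
            return None
        p = self.queue.pop(0)
        if self.pos > 0:
            self.pos -= 1   # the served packet was a favoured one
        return p
```

With a capacity of three packets, enqueuing three packets of flow A and then one packet of a new flow B pushes out the last packet of A and places the packet of B right behind the first packet of A, which was itself favoured when it arrived in an empty queue.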

III. Experimental methodology 

We use ns-2 to evaluate the performance of FavourQueue. Our simulation model allows us to apply different levels of load in order to compare FavourQueue with DropTail. The evaluations are done over a simple dumbbell topology. The network traffic is modeled in terms of flows, where each flow corresponds to a TCP file transfer. We consider an isolated bottleneck link of capacity $C$ in bits per second. The traffic demand, expressed as a bit rate, is the product of the flow arrival rate $\lambda$ and the average flow size $E[\sigma]$. The load offered to the link is then defined by the following ratio:

$$\rho = \frac{\lambda E[\sigma]}{C} \qquad (1)$$

The load is changed by varying the flow arrival rate [8]. Thus, the congestion level increases as a function of the load. As all flows are independent, the flow arrivals are modeled by a Poisson process. A reasonable fit to the heavy-tail distribution of the flow size observed in practice is provided by the Pareto distribution. The shape parameter is set to 1.3 and the mean size to 30 packets. The left side of Figure 2 gives the flow size distribution used in the simulation model.
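As an illustration of this traffic model, the sketch below derives the flow arrival rate from a target load and draws Poisson arrivals with Pareto-distributed sizes. The function names are hypothetical, and the way the Pareto scale is derived from the mean (scale = mean × (shape − 1) / shape) is an assumption: the text does not state how the ns-2 scripts parameterise the distribution, and rounding sizes to whole packets slightly shifts the mean.

```python
import random


def flow_arrival_rate(load, capacity_bps, mean_flow_bits):
    """Flow arrival rate (flows/s) giving the requested load on the link."""
    return load * capacity_bps / mean_flow_bits


def pareto_flow_size(rng, shape=1.3, mean_packets=30):
    """Draw a flow size in packets from a Pareto law with the given mean."""
    scale = mean_packets * (shape - 1) / shape   # assumed parameterisation
    return max(1, round(scale * rng.paretovariate(shape)))


def generate_flows(load, duration_s, capacity_bps=10_000_000,
                   packet_bytes=1500, mean_packets=30, seed=1):
    """Return a list of (arrival_time_s, size_packets) for one load level."""
    rng = random.Random(seed)
    rate = flow_arrival_rate(load, capacity_bps,
                             mean_packets * packet_bytes * 8)
    flows, t = [], 0.0
    while True:
        t += rng.expovariate(rate)   # exponential gaps give Poisson arrivals
        if t >= duration_s:
            break
        flows.append((t, pareto_flow_size(rng)))
    return flows
```

For a 10 Mbps bottleneck and a mean flow size of 30 packets of 1500 bytes, a load of 0.9 corresponds to 25 new flows per second.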

At the TCP flow level, the ns-2 TCP connection establishment phase is enabled and the initial congestion window size is set to two packets. As a result, the TCP SYN packet is taken into account in all datasets. The load introduced in the network consists of several flows with different RTTs, according to the recommendations given in the "Common TCP evaluation suite" paper [8]. The load ranges from 0.05 to 0.95 with a step of 0.1. The simulation is bounded to 500 seconds for each given load. To remove both TCP feedback synchronization and phase effects, a traffic load of 10% is generated in the opposite direction. The flows in the transient phase are removed from the analysis. More precisely, only flows starting after the first forty seconds are used in the analysis. The bottleneck link capacity is set to 10 Mbps. All other links have a capacity of 100 Mbps. The rule of thumb says that the buffer size B can be set to T x C, with T the round-trip propagation delay and C the link capacity, and according to the small buffers rule [9], buffers can be reduced by a factor of ten. We choose T = 100 ms as it corresponds to the average RTT of the flows in the experiment. The buffer size at the two routers is therefore set to a bandwidth-delay product with a delay of 10 ms. The packet length is fixed to 1500 bytes and the buffer size is 8 packets.
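The buffer size quoted above follows from a short calculation, sketched here with integer arithmetic. The function name and its parameters are illustrative only; the factor of ten is the reduction attributed to the small buffers rule in the text.

```python
def buffer_size_packets(capacity_bps, rtt_ms, packet_bytes, reduction=10):
    """Buffer size in whole packets: (T x C) divided by the reduction factor."""
    bdp_bits = capacity_bps * rtt_ms // 1000      # bandwidth-delay product
    reduced_bits = bdp_bits // reduction          # small buffers rule
    return reduced_bits // (packet_bytes * 8)     # whole packets that fit


# 10 Mbps x 100 ms = 1,000,000 bits; one tenth is 100,000 bits, i.e. 12,500
# bytes, which holds 8 full packets of 1500 bytes.
print(buffer_size_packets(10_000_000, 100, 1500))
```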

To improve the confidence of these statistical results, each experiment for a given load is repeated ten times using different sequences of pseudo-random numbers (in the following, we refer to this as a ten replications experiment). Some figures also average the ten replications, meaning that we aggregate and average all flows from all ten replications and for all load conditions. In this case, we talk about ten averaged experiment results, which represent a dataset of nearly 17 million packets. The rationale is to consider these data as a real measurement capture where the load varies as a function of time (as in [2]), since each load condition has the same duration. In other words, this represents a global network behaviour.

The purpose of these experiments is to weigh up the benefits brought by our scheme in the context of TCP best-effort flows. To do this, we first run a given scenario with DropTail, then compare it with the results obtained with FavourQueue. We enable FavourQueue only on the uplink (data path), while DropTail always remains on the downlink (ACK path). We only compare flows that terminate in both experiments (i.e. DropTail and FavourQueue), in order to assess the performance obtained in terms of service for the same set of flows.

We assume our model follows the characteristics of Internet short TCP flows, as the latency distribution we obtain in Figure 2 has the same general form as the measured one in Figure 1. This comparison provides a reasonable validation of the model in terms of latency. As explained above, Figure 2 corresponds to a ten averaged experiment.

IV. Performance evaluation of TCP flows with FavourQueue

We present in this section the global performance obtained by FavourQueue; we then analyze its performance in more depth and investigate the case of persistent flows. We compare the same set of flows to assess the performance obtained with DropTail and FavourQueue.

A. Overall performance 

We are interested in assessing the performance of each TCP flow in terms of latency and goodput. We recall from Section I that we defined the latency as the time to complete a data download (i.e. the transmission time); the goodput is the average pace of the download. In order to assess the overall performance of FavourQueue compared to DropTail, Figure 3 gives the mean and standard deviation of the latency as a function of the traffic load. We study FavourQueue both with and without the push-out mechanism in order to distinguish the supplementary gain provided by the drop precedence.

The results are unequivocal. The FavourQueue version without push-out, as presented in [5], provides a gain over DropTail when the load increases (i.e. when the queue has a significant probability of having a non-zero length), while the drop precedence (with push-out) clearly brings a significant additional gain in terms of latency. Basically, Figure 4 shows that both queues (with and without push-out) globally drop the same amount of packets. However, the push-out version better protects short TCP flows (and more generally, all flows entering a slow-start phase) since, when the queue is congested, it always enqueues a packet from a new flow. As a result, the initial loss of packets further decreases. Indeed, as already emphasized in Figure 2 in the introduction, losses no longer occur at the beginning of a connection and, as a result, the flow is no longer impacted by the retransmission overhead resulting from an RTO. Thus, our favouring scheme prevents lost packets during the startup phase. As a matter of fact, this is explained by a different distribution of these losses.



Following this, we have computed the resulting normalized goodput for all flow sizes and all experiments: it is 2.4% with DropTail and 3.5% with FavourQueue (i.e. a difference of around 1%). This difference is not negligible, as it corresponds to an increase of 45%.
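The relative increase can be checked directly from the two normalized goodput values reported above; the short computation below is only a worked restatement of those figures:

```latex
\frac{3.5 - 2.4}{2.4} = \frac{1.1}{2.4} \approx 0.458
```

that is, an increase of about 45% relative to DropTail, for an absolute difference of 1.1 percentage points.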

Figure 5 gives the average latency as a function of the flow length. The cumulative distribution function of the flow length is also represented. On average, we observe that FavourQueue obtains a lower latency than DropTail whatever the flow length. This difference is also larger for the short TCP flows, which are also the most numerous (we recall that the flow size follows a Pareto distribution and, as a result, the number of short TCP flows is higher). This demonstrates that FavourQueue particularly favours the slow-start of every flow and, as a matter of fact, short TCP flows. The cloud pattern obtained for flow sizes above one hundred packets is due to the smaller statistical sample (following the Pareto distribution used for the experiment), which results in a greater dispersion of the results. As a result, we cannot draw a consistent latency analysis for sizes above one hundred packets.

To complete these results, Figure 6 gives the latency obtained when we increase the size of both queues. We observe that, whatever the queue size, FavourQueue always obtains a lower latency. Beyond a given queue size (x = 60 in Figure 6), increasing the queue size does not have an impact on the latency. This reinforces the consistency of the solution, as Internet routers avoid the use of large queue sizes.
B. Performance analysis

To refine our analysis of the latency, we propose to evaluate the per-flow difference of latency between both queues. We denote $\Delta_i = Td_i - Tf_i$, with $Td_i$ and $Tf_i$ the latency observed respectively with DropTail and FavourQueue for a given flow $i$. Figure 7 gives the distribution of the latency differences. This figure illustrates that more flows see their latency decrease than increase. Furthermore, for 16% of flows, there is no impact on the latency, i.e. $\Delta = 0$. In other words, 84% of flows observe a change of latency: 55% of flows observe a decrease ($\Delta > 0$), and 10% of flows observe a significant decrease ($\Delta > 1$ second). However, 30% of the flows observe an increase of their latency ($\Delta < 0$). In summary, FavourQueue has a positive impact on certain flows that are penalised with DropTail.
Fig. 7. Cumulative distribution function of the latency difference $\Delta$.
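The per-flow comparison described above can be sketched as follows. The function and key names are hypothetical; the inputs are assumed to be two lists of latencies, in seconds, for the same flows under DropTail and FavourQueue, and the one-second threshold for a significant decrease is the one used in the text.

```python
def latency_difference_summary(droptail, favourqueue, significant_s=1.0):
    """Share of flows by sign of delta_i = Td_i - Tf_i (positive = faster)."""
    deltas = [td - tf for td, tf in zip(droptail, favourqueue)]
    n = len(deltas)
    return {
        "unchanged": sum(d == 0 for d in deltas) / n,
        "decrease": sum(d > 0 for d in deltas) / n,       # FavourQueue faster
        "increase": sum(d < 0 for d in deltas) / n,       # FavourQueue slower
        "significant": sum(d > significant_s for d in deltas) / n,
    }


# Toy data for four flows (not taken from the paper's dataset).
print(latency_difference_summary([1.0, 4.0, 0.5, 2.0], [1.0, 1.0, 0.75, 1.5]))
```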

In order to assess which flows gain in terms of latency, Figure 8 gives the probability of latency improvement. For the whole set of short TCP flows (i.e. with a size lower than 10 packets), the probability of improving the latency reaches 58%, while the probability of degrading it is 25%. For long TCP flows (i.e. above 100 packets), the probabilities of improving and degrading the latency are respectively 80% and 20%. The flows with a size around 30 packets are the ones with the highest probability of being penalised. For long TCP flows, the large variation of the probability indicates an uncertainty which mainly depends on the experimental conditions of the flows. We have to remark that long TCP flows are less present in this experimental model (approximately 2% of the flows have a size higher than or equal to 100 packets). As this curve corresponds to a ten averaged experiment, long TCP flows have experienced various load conditions, and this explains these large oscillations.

Medium-sized flows are characterized by a predominance of the slow-start phase. During this phase, each flow opportunistically occupies the queue and, as a result, fewer packets are favoured due to the growth of the TCP window. The increase of the latency observed for medium-sized flows (ranging from 10 to 100 packets) is investigated later in subsection V-A. We will also see in subsection IV-C that FavourQueue acts like a shaper for these particular flows.



To estimate the latency variation, we define G(x), the latency gain for the flows of length x, as follows:

G(x) (2)

with N the number of flows of length x. A positive gain indicates a decrease of the latency with FavourQueue. Figure 9 provides the positive, negative and total gains as a function of the flow size. We observe an important total gain for the short TCP flows. The medium-sized flows obtain the highest negative gain, and this negative gain decreases when the size of the flows increases. Although some short flows observe an increase of their latency, in general the positive gain is always higher. This preliminary analysis illustrates that FavourQueue improves the best-effort service by 30% on average in terms of latency. The flows that take the biggest advantage of this scheme are the short flows, with a gain of up to 55%.
Fig. 9. Average latency gain per flow length.

Finally, to conclude this section, we plot in Figure 10 the number of flows in the system under both AQMs as a function of time, to assess the change in the stability of the network. We observe that FavourQueue considerably reduces both the average number of flows in the network and its variability.

Fig. 10. Number of simultaneous flows in the network.



C. The case of persistent flows 
Fig. 11. Number of short flows in the network when persistent flows are active.

Following [10], we evaluate how the proposed scheme affects persistent flows in the presence of randomly arriving short TCP flows. We now change the network conditions with 20% of short TCP flows with exponentially distributed flow sizes with a mean of 6 packets. Forty seconds later, 50 persistent flows are started. Figure 11 gives the number of simultaneous short flows in the network. When the 50 persistent flows start, the number of short flows increases and oscillates around 100 with DropTail. With FavourQueue, the number only increases to 30 short flows. The short flows still take advantage of the favouring scheme, and Figure 12 confirms this point. However, we observe in Figure 13 that the persistent flows are not penalized. The mean throughput is nearly the same (1.81% for DropTail versus 1.86% for FavourQueue) and the variance is smaller with FavourQueue. Basically, FavourQueue acts as a shaper by slowing down opportunistic flows while decreasing the drop ratio of non-opportunistic flows (those that occupy the queue the least).
Fig. 13. Mean throughput as a function of flow length for 50 persistent flows.



V. Understanding FavourQueue 

The previous section has shown the benefits obtained with FavourQueue in terms of service. In this section, we analyse the reasons for the improvements brought by FavourQueue by looking at the AQM performance. We study the drop ratio and the queueing delay obtained by both queues. We recall that for all experiments, FavourQueue is only set on the upstream path. The reverse path uses a DropTail queue. We first look at the impact of the AQM on the network, then on the end-host.



A. Impact on the network 

Figure 14 shows the evolution of the average queueing delay depending on the size of the flow. This figure corresponds to the ten averaged experiment (as defined in Section III). Basically, the results obtained by FavourQueue and DropTail are similar. Indeed, the average queueing delay is 2.8 ms for FavourQueue versus 2.9 ms for DropTail, and both curves behave similarly. We can notice that the queueing delay for the medium-sized flows slightly increases with FavourQueue. These flows are characterized by a predominance of the slow-start phase, as most of the packets that belong to these flows are emitted during the slow-start. Since each flow opportunistically occupies the queue during this phase, fewer packets are favoured due to the growth of the TCP window. As a result, their queueing delay increases. When the size of the flow increases (above one hundred packets), the slow-start is no longer predominant and the average queueing delay of each packet of these flows tends to be either higher or lower, as suggested by the cloud in Figure 14, depending on the number of favoured packets during their congestion avoidance phase.

Fig. 14. Average queuing delay according to flow length.

Fig. 15. Average drop ratio according to flow length.

However, as Figure 15 suggests, the good performance in terms of latency obtained by FavourQueue, previously shown in Figure 3 in Section IV, is mostly due to a significant decrease of the drop ratio. If we look at the average drop ratio of both queues in Figure 15, still as a function of the flow length, we clearly observe that the number of packets dropped is lower for FavourQueue. Furthermore, the loss ratio for the flow size of 2 packets is about $10^{-3}$, meaning that the flows of this size obtain a benefit compared to DropTail. The slow-start phase is known to send bursts of data [11]. Thus, most of the packets sent during the slow-start phase have a high probability of not being favoured. This explains the increase of the drop ratio with the flow size up to 60 packets. Indeed, in the slow-start phase, packets are sent in bursts of two packets. As a result, the first packet is favoured and the second one is favoured only if the first has already been served. Otherwise, the second packet might be delayed. Then, when their respective acknowledgements are back at the source, the next transmissions will be more spaced out. As FavourQueue might decrease the burstiness of the slow-start, it might decrease the packet loss rate and thus improve short TCP flow performance.

If we jointly consider Figures 14 and 15, we observe that FavourQueue enables a kind of traffic shaping that decreases TCP aggressiveness during the slow-start phase, which results in a decrease of the number of dropped packets. As the TCP goodput is proportional to $1/(RTT \cdot \sqrt{p})$ [12], the decrease of the drop ratio leads to an increase of the goodput, which explains the good performance obtained by FavourQueue in terms of latency.
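To make the dependence on the drop ratio concrete, the sketch below evaluates the proportionality goodput ∝ 1/(RTT·√p) for two drop ratios. The function names and the numerical values are illustrative assumptions; only the proportionality itself comes from the text, so the result is a ratio and not an absolute goodput.

```python
import math


def relative_goodput(rtt_s, drop_ratio):
    """Goodput up to a constant factor: 1 / (RTT * sqrt(p))."""
    return 1.0 / (rtt_s * math.sqrt(drop_ratio))


def goodput_gain(drop_before, drop_after, rtt_s=0.1):
    """Goodput ratio obtained when the drop ratio changes at constant RTT."""
    return (relative_goodput(rtt_s, drop_after)
            / relative_goodput(rtt_s, drop_before))


# Dividing the drop ratio by four doubles the goodput under this relation.
print(goodput_gain(0.04, 0.01))
```

Because the RTT cancels out in the ratio, the gain depends only on the square root of the ratio between the two drop ratios.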

The loss ratio of SYN segments is on average 1.8% with DropTail. However, for a load higher than 0.75, this loss ratio reaches 2.09%, while with FavourQueue this ratio is 0.06%. Finally, on average over all load conditions, this value is 0.04% with FavourQueue. These results demonstrate the positive effect of protecting SYN segments from loss. Obviously, using FavourQueue in duplex mode (we recall that we have tested FavourQueue only on the upstream path) would further improve the results, as SYN/ACK packets would also be protected.

B. Impact on the end-host performance 

The good performance obtained with FavourQueue in terms of latency is linked to the decrease of the losses at the beginning of the flow. In the following, we estimate the benefits of our scheme by estimating the RTO ratio as a function of the network load. We define the RTO ratio $T(\rho)$ for a given load $\rho$ as follows:

$T(\rho)$ (3)

with $RTO_i$ the number of RTOs for flow $i$, $R_i$ its number of retransmissions and $L_i$ its size. The ten replications experiment in Figure 16 presents the evolution of the RTO ratio for FavourQueue and DropTail and shows that the decrease of the loss ratio results in a decrease of the RTO ratio for FavourQueue. This also shows the advantage of using FavourQueue when the network is heavily loaded.

We now evaluate the RTO recovery ratio as a function of the flow length. We define this RTO recovery ratio as follows:

$$t(x) = \frac{\sum_{i=1}^{N} RTO_i}{\sum_{i=1}^{N} (RTO_i + FR_i)} \qquad (4)$$

with $FR_i$ the number of TCP Fast Retransmits for flow $i$. In terms of RTO recovery, Figure 17 shows a significant decrease of the number of recoveries done with an RTO. Concerning the ratio of Fast Retransmit for this experiment, we observe an increase of 14% with FavourQueue. As a fast recovery packet is placed at the beginning of a window, FavourQueue prevents the loss of a retransmission. The number of recoveries with Fast Retransmit is therefore higher with FavourQueue, and the latency observed is better since the retransmissions are faster.

For flows with a size strictly below six packets, the recovery is exclusively done by an RTO. Indeed, in this case there are not enough duplicate acknowledgements to trigger a Fast Retransmit. For flows above six packets, we observe a noticeable decrease of the RTO ratio due to the decrease of the packet loss rate on the first packets of the flow. Thus, the number of duplicate acknowledgements is higher, which allows a Fast Retransmit recovery phase to be triggered. The trend shows a global decrease of the RTO ratio when the flow length increases. Overall, the RTO recovery ratio reaches 56% for DropTail and 38% for FavourQueue. The gain obtained decreases as the flow size increases. This means that FavourQueue helps the connection establishment phase.

Fig. 16. RTO ratio as a function of the network load.

Fig. 17. RTO recovery ratio according to flow length.
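As an illustration, the RTO recovery ratio per flow length can be computed from per-flow counters as sketched below. The sketch assumes the ratio is the number of recoveries done by an RTO divided by the total number of recoveries (RTO plus Fast Retransmit) over all flows of a given length; the function name and the input format are hypothetical.

```python
from collections import defaultdict


def rto_recovery_ratio(flows):
    """flows: iterable of (length_packets, rto_count, fast_retransmit_count).

    Returns {length: share of loss recoveries done by an RTO}; lengths for
    which no recovery happened at all are left out.
    """
    rto = defaultdict(int)
    total = defaultdict(int)
    for length, n_rto, n_fr in flows:
        rto[length] += n_rto
        total[length] += n_rto + n_fr
    return {x: rto[x] / total[x] for x in total if total[x] > 0}


# Toy counters: short flows recover only by RTO, longer ones mostly by
# Fast Retransmit (values are illustrative, not measurements).
sample = [(4, 1, 0), (4, 2, 0), (20, 1, 3), (20, 0, 1), (50, 0, 0)]
print(rto_recovery_ratio(sample))
```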

VI. Stochastic model of FavourQueue 

We analyze in this part the impacts of the temporal and drop priorities previously defined in Section II and propose a stochastic model of the mechanism.

A. Preliminary statistical analysis 

We first estimate the probability of favouring a flow as a function of its length through a statistical analysis. We define $P(Favor \mid S = s)$, the probability to favour a flow of size $s$, as follows:

$P(Favor \mid S = s)$ (5)

with $fa_i$ the number of packets which have been favoured and $R_i$ the number of retransmitted packets of a given flow $i$.



The number of favoured packets corresponds to the number of packets selected to be favoured at the router queue. Figure 18 gives the results obtained and shows that:

• the flows with a size of two packets are always favoured;

• the medium-sized flows that mainly remain in a slow-start phase are less favoured compared to short flows. The ratio reaches 50%, meaning that one packet out of two is favoured;

• long TCP flows get a favouring ratio around 70%.

Fig. 18. Probability of packet favouring according to flow length.

Fig. 19. Push-out proportion of drop as a function of flow length.

We also investigate the ratio of packets dropped by the push-out algorithm as a function of the flow length, in order to assess whether some flows are more penalised by push-out. As shown in Figure 19, the mean is about 30% for all flows, meaning that the push-out algorithm does not impact short TCP flows more than long ones.

We now propose to build a stochastic model of Figure 18.

B. Stochastic model 

We denote by S the random variable of the size of the flow
and by Z the Bernoulli random variable which is equal to 0 if
no favoured packets are present in the queue and 1 otherwise.
We then distinguish three different phases:

• phase #1: all flows have a size lower than $s_1$. In this
phase, the flows are in slow-start mode. This size is a
parameter of the model which depends on the load;

• phase #2: all flows have a size higher than $s_1$ and lower
than $s_2$. In this phase, flows progressively leave the
slow-start mode (corresponding to the bowl between [10 : 100]
in Figure 18). This is the most complex phase to model
as all flows are either in the congestion avoidance phase
or at the end of their slow-start. $s_2$ is also a parameter of
the model which depends on the load;

• phase #3: all flows have a size higher than $s_2$. All
flows are in the congestion avoidance phase. Note that the
statistical sample which represents this cloud is not large
enough to correctly model this part (as already pointed
out in Section IV-A). However, another important result
given by Figure 18 is that 70% of the packets of flows in
congestion avoidance mode are favoured. We will use
this information to infer the model. This also confirms
that the spacing between packets in the congestion
avoidance phase increases the probability of an arriving
packet to be favoured.

First phase: We consider a bursty arrival and assume that
all packets belonging to the previous RTT have left the queue.
Then, the burst size (BS) can take the following values: BS =
1, 2, 4, 8, 16, 32, .... If Z = 0, we assume that a maximum
of 3 packets can be favoured in a row³, so the packet numbers
that are favoured are 1, 2, 3, 4, 5, 6, 8, 9, 10, 16, 17, 18, ...;
if Z = 1, they are 1, 2, 4, 8, 16, 32, .... Thus, if Z = 0, the
probability to favour a packet of a flow of size s is:



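$$
P(\mathrm{Favor} \mid (Z = 0, S = s)) =
\begin{cases}
\frac{s}{s}, & s \le 6 \\
\frac{s-1}{s}, & 7 \le s \le 10 \\
\frac{9}{s}, & 11 \le s \le 15 \\
\frac{s-6}{s}, & 16 \le s \le 18 \\
\frac{12}{s}, & 19 \le s \le 31 \\
(\ldots)
\end{cases}
\tag{6}
$$

and with Z = 1:

$$
P(\mathrm{Favor} \mid (Z = 1, S = s)) =
\begin{cases}
1, & s = 1 \\
\frac{2}{s}, & 2 \le s \le 3 \\
\frac{3}{s}, & 4 \le s \le 7 \\
\frac{4}{s}, & 8 \le s \le 15 \\
\frac{5}{s}, & 16 \le s \le 31 \\
(\ldots)
\end{cases}
\tag{7}
$$

The probability to favour a packet of a flow of size s is
thus:

$$
\begin{aligned}
P(\mathrm{Favor} \mid S = s) ={} & P(Z = 0)\, P(\mathrm{Favor} \mid (Z = 0, S = s)) \\
 & + P(Z = 1)\, P(\mathrm{Favor} \mid (Z = 1, S = s))
\end{aligned}
\tag{8}
$$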

Once again, P(Z = 0) and P(Z = 1) depend on the load
of the experiment and must be given.
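
The first-phase probabilities can be obtained by counting, burst
by burst, how many of the first s packets are favoured. The
Python sketch below reproduces that count and mixes the Z = 0
and Z = 1 cases weighted by P(Z = 0) and P(Z = 1). It is an
illustration only: the function names are ours, and it assumes
idealised slow-start bursts of 1, 2, 4, 8, ... packets with no
retransmission.

```python
def favoured_count(s: int, per_burst: int) -> int:
    """Count favoured packets among the first s packets of a slow-start flow.

    Bursts have sizes 1, 2, 4, 8, ...; at most `per_burst` packets at the
    head of each burst are favoured (3 when Z = 0, 1 when Z = 1).
    """
    count, burst, sent = 0, 1, 0
    while sent < s:
        in_burst = min(burst, s - sent)    # packets of this burst that exist
        count += min(per_burst, in_burst)  # only the head of the burst is favoured
        sent += in_burst
        burst *= 2
    return count


def p_favour_first_phase(s: int, p_z1: float) -> float:
    """Mix the Z = 0 and Z = 1 cases for a flow of size s."""
    p_given_z0 = favoured_count(s, 3) / s  # Z = 0: up to three packets per burst
    p_given_z1 = favoured_count(s, 1) / s  # Z = 1: one packet per burst
    return (1.0 - p_z1) * p_given_z0 + p_z1 * p_given_z1
```

For instance, with s = 10 and P(Z = 1) = 0.25, the Z = 0 case
gives 9/10, the Z = 1 case gives 4/10, and the mix is
0.75 × 0.9 + 0.25 × 0.4 = 0.775.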

Second phase: In this phase, each flow progressively leaves
the slow-start phase. First, when a flow finishes its slow-start
phase, each following packet has a probability of 70% of being
favoured (as shown in Figure 18). So, we now need to
compute an average value of the probability to favour a packet
for a given flow. We also have to take into account that, for
a given flow size $s$, only a proportion of these flows have
effectively left the slow-start phase. The other ones remain
in slow-start and the analysis of their probability to favour a
packet follows the first phase. To correctly describe this phase,
we need to assess which part of the flows of size $s$,
$s_1 \le s \le s_2$, has left the slow-start phase at packet
$s_1, s_1 + 1, \ldots, s$. As a first approximation, we use a
uniform distribution between $s_1$ and $s_2$. This means that for
flows of size $s$, the proportion of flows which have left the
slow-start phase at each of $s_1, s_1 + 1, \ldots, s - 1$ is
$\frac{1}{s_2 - s_1}$, and the proportion of flows of size $s$
which have not yet left the slow-start phase is thus
$\frac{s_2 - s}{s_2 - s_1}$.

3 The rationale is the following: if Z = 0, a single packet (such as the SYN
packet) is favoured and, one RTT later, the burst of two packets (or larger)
will be favoured if we consider that the first packet of this burst is directly
served.

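If we denote by $p_k$ the proportion of flows of size $s \ge s_1$
that have left the slow-start phase at $k$, we have:

$$
P(\mathrm{Favor} \mid (S = s, Z = 0)) =
\sum_{i=0}^{s - s_1 - 1} p_k \, P(\mathrm{Favor} \mid k = s_1 + i, Z = 0, S = s)
$$

and

$$
P(\mathrm{Favor} \mid (S = s, Z = 1)) =
\sum_{i=0}^{s - s_1 - 1} p_k \, P(\mathrm{Favor} \mid k = s_1 + i, Z = 1, S = s)
$$

and as in (8) we obtain:

$$
\begin{aligned}
P(\mathrm{Favor} \mid S = s) ={} &
P(Z = 0) \sum_{i=0}^{s - s_1 - 1} p_k \, P(\mathrm{Favor} \mid k = s_1 + i, Z = 0, S = s) \\
 & + P(Z = 1) \sum_{i=0}^{s - s_1 - 1} p_k \, P(\mathrm{Favor} \mid k = s_1 + i, Z = 1, S = s)
\end{aligned}
$$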
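
To make the averaging over the exit point k concrete, here is a
minimal Python sketch of the second phase. It rests on several
assumptions that are ours and not stated in the text: p_k is
taken as the uniform weight 1/(s_2 - s_1); a flow that leaves
slow-start at packet k behaves as in the first phase for its
first k packets and then has each remaining packet favoured with
probability 0.7; and the flows still in slow-start are added
with their first-phase probability. The values of s_1 and s_2
used in the example are purely illustrative.

```python
P_CA = 0.7  # favouring probability after slow-start, as read from Figure 18


def favoured_count(s, per_burst):
    """Favoured packets among the first s packets sent in bursts of 1, 2, 4, ..."""
    count, burst, sent = 0, 1, 0
    while sent < s:
        in_burst = min(burst, s - sent)
        count += min(per_burst, in_burst)
        sent += in_burst
        burst *= 2
    return count


def p_favour_given_exit(s, k, per_burst):
    """Assumed P(Favor | k, Z, S = s): first-phase count up to packet k,
    then probability P_CA for each of the remaining s - k packets."""
    return (favoured_count(k, per_burst) + P_CA * (s - k)) / s


def p_favour_second_phase(s, s1, s2, p_z1):
    """Average over the exit point k with uniform weights p_k = 1 / (s2 - s1)."""
    if not (s1 <= s <= s2 and s1 < s2):
        raise ValueError("expected s1 <= s <= s2 and s1 < s2")
    p_k = 1.0 / (s2 - s1)
    still_slow_start = (s2 - s) / (s2 - s1)  # flows that have not left slow-start
    total = 0.0
    for p_z, per_burst in ((1.0 - p_z1, 3), (p_z1, 1)):  # Z = 0, then Z = 1
        left = sum(p_k * p_favour_given_exit(s, s1 + i, per_burst)
                   for i in range(s - s1))  # i = 0 .. s - s1 - 1
        stay = still_slow_start * favoured_count(s, per_burst) / s
        total += p_z * (left + stay)
    return total
```

With these assumptions the second phase joins the first one
continuously: at s = s_1 no flow has left slow-start yet, so the
function returns the first-phase value.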



Third phase: The model of this phase is quite simple. In
fact, each packet of a flow which has left the slow-start phase
has a probability of 70% of being favoured. We compute the
probability for a packet to be favoured by taking into account
the time at which a flow has left the slow-start phase and the
proportion of flows, as in the second phase.

Model fitting: To verify our model, among the ten loads that
are averaged in Figure 18, we choose two loads:
p = 0.25 and p = 0.85. For the first one we
have estimated P(Z = 1) = 0.25, and P(Z = 1) = 0.7 for
the second. Figures 20 and 21 show that our model correctly
fits both experiments.

This model allows us to understand the peaks in Figure 20
when the flow size is lower than one hundred packets. These
peaks are explained by the modelling of the first phase. Indeed,
the traffic during the slow-start is bursty, so each burst has
one or several packets favoured depending on Z (i.e. up to three
packets are favoured when Z = 0 and only one when Z = 1,
as given by (6) and (7)).



Fig. 20. Model fitting for p = 0.25 with P(Z = 1) = 0.25.
Fig. 21. Model fitting for p = 0.85 with P(Z = 1) = 0.7.

VII. Related work

Several improvements have been proposed in the literature
and at the IETF to attempt to solve the problem of short
TCP flows performance. Existing solutions can be classified
into three different action types: (1) to enable a scheduling
algorithm at the router queue level; (2) to give a priority
to certain TCP packets; or (3) to act at the TCP level in
order to decrease the number of RTOs or the loss probability.
The first two items involve the core network, while the third
one involves modifications at the end-host. In this related
work, we first situate FaQ among several core network
solutions and then explain how FaQ might complement
end-host solutions.

A. Enhancing short TCP flows performance inside the core
network

1) The case of short and long TCP flows differentiation:
Several studies [13], [10], [14] have proposed to serve short
TCP traffic first in order to improve the overall system
performance. These studies follow a queueing theory result
which states that the overall mean latency is reduced when the
shortest job is served first [15]. One of the precursors in the
area is [14], where the authors proposed to adapt Least
Attained Service (LAS) [13], a scheduling mechanism that
favours short jobs without prior knowledge of job sizes, to
packet networks. As for FavourQueue, LAS is not only a
scheduling discipline but also a buffer management mechanism.
This mechanism follows the FavourQueue principle, since
priority is given to a packet without knowledge of the size of
the flow and the classification is closely related to the
buffer management scheme. However, the next packet serviced
under LAS is the one that belongs to the flow that has received
the least amount of service. By this definition, LAS will serve
packets from a newly arriving flow until that flow has received
an amount of service equal to the least amount of service
received by a flow in the system before its arrival. Compared
to LAS, FavourQueue has no notion of amount of service, as
we seek to favour short jobs by accelerating their connection
establishment. Thus, there is no configuration and no complex
setting.
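
The LAS service rule described above can be sketched in a few
lines of Python. This is a toy illustration, not the scheduler
of the cited work: the class and attribute names are ours, and
service is counted in bytes by assumption. It makes visible the
per-flow counter of attained service that LAS has to maintain
and that FavourQueue does without.

```python
from collections import deque


class LasScheduler:
    """Toy LAS: always serve the backlogged flow with the least attained service."""

    def __init__(self):
        self.queues = {}    # flow id -> deque of packet sizes (bytes)
        self.attained = {}  # flow id -> bytes served so far (per-flow state)

    def enqueue(self, flow, size):
        self.queues.setdefault(flow, deque()).append(size)
        self.attained.setdefault(flow, 0)

    def dequeue(self):
        backlogged = [f for f, q in self.queues.items() if q]
        if not backlogged:
            return None
        # The flow that has received the least service so far is served next.
        flow = min(backlogged, key=lambda f: self.attained[f])
        size = self.queues[flow].popleft()
        self.attained[flow] += size
        return flow, size
```

A newly arriving flow starts with zero attained service, so it
is served ahead of older flows until it catches up with them.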

In [13] and [10], the authors push the same idea further and
attempt to differentiate short from long TCP flows according
to a scheduling algorithm. These solutions differ in the number
of queues used, which are either flow stateless or stateful.
They use an AQM which enables a push-out algorithm to protect
short TCP flow packets from loss. Short TCP flow
identification is done inside the router by looking at the TCP
sequence number [10]. However, in order to correctly
distinguish short from long TCP flows, the authors modify the
standard TCP sequence numbering, which involves a major
modification of the TCP/IP stack. In [13], the authors propose
another solution with a per-flow state and deficit round robin
(DRR) scheduling to provide a fairness guarantee. The main
drawback of [14], [13] is the need for a per-flow state, while
[10] requires modifications of TCP senders.

2) The case of giving a priority to certain TCP packets:
Giving a priority to certain TCP packets is not a novel idea.
Several studies have investigated the benefit of this concept
to improve the performance of TCP connections. This approach
was really popular during the QoS networks research epoch,
as many queueing disciplines were enabled over IntServ and
DiffServ testbeds, allowing researchers to investigate such
priority effects. Basically, the priority can be set intra-flow
or inter-flow. Marco Mellia et al. [16] have proposed to use
intra-flow priority in order to protect some key identified
packets of a TCP connection from loss, and thereby increase
the TCP throughput of a flow over an AF DiffServ class. In
this study, the authors observe that TCP performance suffers
significantly in the presence of bursty, non-adaptive
cross-traffic or when it operates in the small window regime,
i.e., when the congestion window is small. The main argument is
that bursty losses, or losses during the small window regime,
may cause retransmission timeouts (RTOs) which will result in
TCP entering the slow-start phase. As a possible solution, the
authors propose qualitative enhancements to protect against
loss: the first several packets of the flow, in order to allow
TCP to safely exit the initial small window regime; several
packets after an RTO occurs, to make sure that the
retransmitted packet is delivered with high probability and
that the TCP sender exits the small window regime; and several
packets after receiving three duplicate acknowledgements, in
order to protect the retransmission. This protects from loss
the packets that most strongly impact the average TCP
throughput. In [3], [17], the authors propose a solution based
on inter-flow priority. Short TCP flows are marked IN; thus,
packets from these flows get a low drop priority. The
differentiation in core routers is applied by an active queue
management. When a sender has sent a number of packets that
exceeds the flow identification threshold, its packets are
marked OUT and the drop probability increases. However, these
approaches need the support of a DiffServ architecture to
perform [18].
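
The inter-flow marking rule described above can be illustrated
with a short Python sketch. The function name and the threshold
value used in the example are hypothetical; the cited proposals
define their own threshold and marking location.

```python
def mark_packets(flow_length, threshold):
    """Return the mark of each packet of a flow of `flow_length` packets.

    Packets are marked IN (low drop priority) until the sender has sent
    `threshold` packets; every later packet is marked OUT.
    """
    return ["IN" if n <= threshold else "OUT"
            for n in range(1, flow_length + 1)]
```

A flow shorter than the threshold is therefore entirely marked
IN, whereas a long flow only has its first packets protected.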

B. Acting at the TCP level 

The last solution is to act at the TCP level. The first
possibility is to improve the behaviour of TCP when a packet
is dropped during the start-up phase (i.e. initial window size,
limited transmit). The second one is to prevent this drop by
decreasing the segment loss probability. For instance, in
[19], the authors propose to apply an ECN mark to SYN/ACK
segments in order to avoid dropping them. The main drawback
of these solutions is that they require important TCP sender
modifications that might involve a heavy standardisation
process.

We wish to point out that one of the hot topics
currently discussed within the Internet Congestion Control
Research Group (ICCRG) deals with the TCP initial window
size. In a recent survey, the authors of [20] highlight that the
problem of short-lived flows is not yet fully investigated
and that the congestion control schemes developed so far do
not really work if the connection lifetime is only one or
two RTTs. Clearly, they argue for further investigation of
the impact of the initial value of the congestion window on the
performance of short-lived flows. Some recent studies have
also demonstrated that a larger initial TCP window helps faster
recovery from packet losses and, as a result, improves the
latency in spite of increased packet losses [21], [22]. Several
proposals also aim to mitigate the impact of the slow
start [?], [?], [25].

Although we do not act at the end-host side, we share the
common goal of reducing latency during the slow-start phase of
a short TCP connection. However, we do not target the same
objective. Indeed, end-host solutions that propose to increase
the number of packets of the initial window seek to mitigate
the impact of the RTT loop, while we seek to favour short
TCP traffic when the network is congested. At the early stage
of the connection, the number of packets exchanged is low
and a short TCP request is constrained both by the RTT loop
and by the small amount of data exchanged. Thus, some studies
propose to increase this initial window value [?], [?]; to
change the pace at which the slow-start sends data packets
by shrinking the timescale at which TCP operates [26]; or even
to completely suppress the slow-start [24]. Basically, all these
proposals attempt to mitigate the impact of the slow-start loop,
which might be counterproductive over large bandwidth-delay
product networks. On the contrary, FavourQueue does not act on
the amount of data exchanged but prevents losses at the
beginning of the connection. As a result, we believe that
FavourQueue must not be seen as a competitor of these end-host
proposals but as a complementary mechanism. We propose to
illustrate this complementarity by looking at the performance
obtained with an initial congestion window set to ten packets.
Figure 22 gives the complementary cumulative distribution
function of the latency for DropTail and FavourQueue with flows
with an initial slow-start window set to two or ten packets. We
have not changed the experimental conditions (i.e. the router
buffer is still set to eight packets) and this experiment
corresponds to ten averaged experiments (see Section III-B). As
explained in [21], if we focus on the results obtained with
DropTail for both initial window sizes, the increase of the
initial window improves the latency (at the price of an increase
of the loss rate, as also noted in [21]). However, the use of
FavourQueue reinforces the performance obtained and complements
the action of such end-host modifications, making FavourQueue a
generic solution to improve short TCP traffic whatever the
slow-start variant used.

Fig. 22. Comparison of the benefit obtained in terms of latency with an
initial TCP window size of ten packets.



VIII. Discussion 

A. Security consideration 

In the related work presented in Section VII, we present
a solution similar to our proposal that gives priority to TCP
packets with a SYN flag set. One of the main criticisms raised
against this kind of proposal usually concerns the TCP SYN
flood attack, where TCP SYN packets may be used by malicious
clients to strengthen this kind of threat [?]. However, this is
a false problem, as accelerating these packets does not
introduce any novel security or stability side-effects, as
explained in [?]. Today, current kernels enable protection to
mitigate such well-known denial of service attacks⁴ and current
Intrusion Detection Systems (IDS) such as SNORT⁵, combined with
firewall rules, allow network providers and companies to stop
such attacks. Indeed, the core network should not be involved
in such end-host security issues, which should remain under the
responsibility of edge networks and end-hosts. Concerning the
reverse path, and as raised in [?], provoking web servers or
hosts to send SYN/ACK packets to third parties in order to
perform a SYN/ACK flood attack would be greatly inefficient.
This is because the third parties would immediately drop such
packets, since they would know that they did not generate the
TCP SYN packets in the first place.

4 See for instance http://www.symantec.com/connect/articles/hardening-tcpip-stack-syn-atta
5 http://www.snort.org/

B. Deployment issue

Although there is no scalability issue anymore inside new
Internet routers, which can manage millions of per-flow states
[?], FavourQueue does not involve per-flow state management, and
the number of entries that a FavourQueue router needs to handle
is a function of the number of packets that can be enqueued.
Furthermore, as the size of a router buffer should be small [?],
the number of states that need to be handled is thus bounded.
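
The bound on the number of entries can be seen in the following
Python sketch. It is a simplified illustration under our own
assumptions, not the implementation evaluated in this paper: a
packet is favoured when no packet of its flow is currently
enqueued, favoured packets are served first, an arriving packet
is simply dropped when the buffer is full (the push-out rule is
left out), and all class and attribute names are hypothetical.

```python
from collections import deque


class FavourQueueSketch:
    """Simplified favouring queue whose flow table is bounded by the buffer size."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.favoured = deque()  # packets served first
        self.regular = deque()   # all other packets
        self.in_queue = {}       # flow id -> packets of that flow currently enqueued

    def __len__(self):
        return len(self.favoured) + len(self.regular)

    def enqueue(self, flow, packet):
        if len(self) >= self.capacity:
            return False  # buffer full: drop the arrival (push-out omitted)
        if flow in self.in_queue:
            self.regular.append((flow, packet))
        else:
            self.favoured.append((flow, packet))  # flow not present: favour it
        self.in_queue[flow] = self.in_queue.get(flow, 0) + 1
        return True

    def dequeue(self):
        source = self.favoured or self.regular  # favoured packets leave first
        if not source:
            return None
        flow, packet = source.popleft()
        self.in_queue[flow] -= 1
        if self.in_queue[flow] == 0:
            del self.in_queue[flow]  # entry removed once the flow has left the queue
        return flow, packet
```

An entry exists only while at least one packet of the flow is
enqueued, so the table never holds more entries than the buffer
holds packets (eight in our experiments).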

To sum up, the proposed scheme respects the following
constraints:

• easily and quickly deployable: this means that
FavourQueue has no tuning parameter and does
not require any protocol modification at the transport or
network level;

• independently deployable: installation can be done
without any coordination between network operators.
Operation must be done without any signalling;

• scalable: no per-flow state is needed.

FavourQueue should be of interest for access networks,
enterprise networks or universities where congestion might
occur at their output Internet link.

IX. Conclusion 

In this paper, we investigate a solution to accelerate short
TCP flows. The main advantages of the proposed AQM are that
FavourQueue is stateless, does not require any modification
inside TCP, can be used over a best-effort network, and does
not need to be deployed over a whole Internet path.
Indeed, a partial deployment could be done only over the routers
of an Internet service provider or over a specific AS.

We ran several simulation scenarios showing that the
drop ratio decreases for all flow lengths, thus decreasing their
latency. FavourQueue significantly improves the performance
of short TCP traffic in terms of transfer delay. The main
reasons are that this mechanism strongly reduces the loss of
a retransmitted packet triggered by an RTO and improves
the connection establishment delay. Although FavourQueue
targets short TCP performance, results show that by protecting
retransmitted packets, the latency of the whole traffic, and
particularly of non-opportunistic flows, is improved.

In future work, we aim to investigate FavourQueue with
rate-based transport protocols such as TFRC, in order to verify
whether we would obtain similar benefits, and with delay-based
TCP protocol variants (such as TCP Vegas and TCP
Compound) that should intuitively benefit greatly from such an
AQM. We also expect to enable ECN support in FavourQueue.